Scalable $k$-NN graph construction

نویسندگان

  • Jingdong Wang
  • Jing Wang
  • Gang Zeng
  • Zhuowen Tu
  • Rui Gan
  • Shipeng Li
چکیده

The k-NN graph has played a central role in increasingly popular data-driven techniques for various learning and vision tasks; yet, finding an efficient and effective way to construct k-NN graphs remains a challenge, especially for large-scale high-dimensional data. In this paper, we propose a new approach to construct approximate k-NN graphs with emphasis in: efficiency and accuracy. We hierarchically and randomly divide the data points into subsets and build an exact neighborhood graph over each subset, achieving a base approximate neighborhood graph; we then repeat this process for several times to generate multiple neighborhood graphs, which are combined to yield a more accurate approximate neighborhood graph. Furthermore, we propose a neighborhood propagation scheme to further enhance the accuracy. We show both theoretical and empirical accuracy and efficiency of our approach to k-NN graph construction and demonstrate significant speed-up in dealing with large scale visual data.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

L1-graph construction using structured sparsity

As a powerful model to represent the data, graph has been widely applied to many machine learning tasks. More notably, to address the problems associated with the traditional graph construction methods, sparse representation has been successfully used for graph construction, and one typical work is L1graph. However, since L1-graph often establishes only part of all the valuable connections betw...

متن کامل

Supervised neighborhood graph construction for semi-supervised classification

Graph based methods are among the most active and applicable approaches studied in semi-supervised learning. The problem of neighborhood graph construction for these methods is addressed in this paper. Neighborhood graph construction plays a key role in the quality of the classification in graph based methods. Several unsupervised graph construction methods have been proposed that have addresse...

متن کامل

Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification

The eigendeomposition of nearest-neighbor (NN) graph Laplacian matrices is the main computational bottleneck in spectral clustering. In this work, we introduce a highly-scalable, spectrum-preserving graph sparsification algorithm that enables to build ultra-sparse NN (u-NN) graphs with guaranteed preservation of the original graph spectrums, such as the first few eigenvectors of the original gr...

متن کامل

Scalable Graph Building from Text Data

In this paper we propose NNCTPH, a new MapReduce algorithm that is able to build an approximate k-NN graph from large text datasets. The algorithm uses a modified version of Context Triggered Piecewise Hashing to bin the input data into buckets, and uses an exhaustive search inside the buckets to build the graph. It also uses multiple stages to join the different unconnected subgraphs. We exper...

متن کامل

An Output-Sensitive Approach for the L 1/L ∞ k-Nearest-Neighbor Voronoi Diagram

This paper revisits the k-nearest-neighbor (k-NN) Voronoi diagram and presents the first output-sensitive paradigm for its construction. It introduces the k-NN Delaunay graph, which corresponds to the graph theoretic dual of the k-NN Voronoi diagram, and uses it as a base to directly compute the k-NN Voronoi diagram in R. In the L1, L∞ metrics this results in O((n + m) log n) time algorithm, us...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1307.7852  شماره 

صفحات  -

تاریخ انتشار 2013